Objectives

  • Implement the Metropolis algorithm for a one-parameter model and apply it to generate a posterior parameter sample for the model “survival”.
  • Calculate a Markov chain sample of the posterior parameter distribution. Starting from a first-guess jump distribution, construct a simple algorithm that improves the jump distribution for higher efficiency.
  • Use graphical analysis and convergence tests to assess the quality of the Markov chain samples.

1. Implement an elementary Markov Chain sampler ★

In this exercise, you will write your own code to implement the most elementary MCMC sampler, the Metropolis algorithm for one dimension. MCMC samplers can be used to sample from any distribution, and they are not linked to Bayesian concepts per se. In Bayesian inference, however, they are especially useful to sample from the posterior distribution. You will apply your self-coded sampler to sample from the posterior distribution of the one-parameter model “survival”, given the observed data in the file model_survival.csv.

Before you get started

We want to use the model “Survival” from yesterday’s exercise and the data model_survival.csv to test our sampler. The experiment was started with N=30 individuals (this information is not contained in the data set).

Reading and plotting the data

It’s always a good idea to look at the data first.

Hints

R
data.survival <- read.table("../../data/model_survival.csv", header=TRUE)
barplot(data.survival$deaths, names=data.survival$time,
        xlab="Time interval", ylab="No. of deaths",
        main="Survival data")

Julia
using DataFrames
import CSV
using Plots
survival_data = CSV.read("../../data/model_survival.csv",
                         DataFrame);

bar(survival_data.deaths,
    labels = false,
    xlabel = "Times interval",
    ylabel = "No. of deaths",
    title = "Survival data")

Python
import pandas as pd
import matplotlib.pyplot as plt

# Loading data
# Specify the path to the CSV file
file_path = r"../../data/model_survival.csv"

# Load the CSV file into a pandas DataFrame
data_survival = pd.read_csv(file_path, sep=" ")

plt.bar(data_survival['time'], data_survival['deaths'])

# Set the x and y labels and title
plt.xlabel("Time interval")
plt.ylabel("No. of deaths")
plt.title("Survival data")

# Display the plot
plt.show()

Defining the likelihood, prior, and posterior

In the next step we will need the posterior density (or more accurately: a function that is proportional to the posterior density) for the model “survival” because \[ p(\lambda|y)\propto p(\lambda,y) = p(y|\lambda)p(\lambda) \; . \] This means that we need the likelihood function \(p(y|\lambda)\) and the prior \(p(\lambda)\).
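Because we only need the posterior up to its normalization constant, and because likelihoods can be extremely small numbers, it is numerically safer to work with logarithms throughout: \[ \log p(\lambda|y) = \log p(y|\lambda) + \log p(\lambda) + \text{const.} \] All functions below therefore return log densities.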

Remember to set N=30.

Likelihood

R

The likelihood is given below (copied from Exercise 1):

loglikelihood.survival <- function(y, N, t, par){

    ## Calculate survival probabilities at measurement points
    ## and add zero "at time-point infinity":
    S <- exp(-par*c(0,t))
    S <- c(S,0)

    ## Calculate probabilities to die within observation windows:
    p <- -diff(S)

    ## Add number of survivors at the end of the experiment:
    y <- c(y, N-sum(y))

    ## Calculate log-likelihood of multinomial distribution at point y:
    LL <- dmultinom(y, prob=p, log=TRUE)

    return(LL)
}
Julia

The likelihood can be implemented like this, copied from Exercise 1.

using Distributions

function loglikelihood_survival(N::Int, t::Vector, y::Vector, λ)
    S = exp.(-λ .* t)
    S = [1; S; 0]
    p = -diff(S)
    ## Add number of survivors at the end of the experiment:
    y = [y; N-sum(y)]

    logpdf(Multinomial(N, p), y)
end

Note that this function is only valid for positive parameter values. We therefore need a prior distribution for the model parameter \(\lambda\) that encodes that it must be positive, and the posterior should return -Inf for impossible values without calling the likelihood.

Python

The likelihood can be implemented like this, copied from Exercise 1.

import numpy as np
from scipy.stats import multinomial

def loglikelihood_survival(y, N, t, par):
    # Calculate survival probabilities at measurement points
    # and add zero "at time-point infinity":
    S = np.exp(-par * np.concatenate(([0], t)))
    S = np.concatenate((S, [0]))

    # Calculate probabilities to die within observation windows:
    p = -np.diff(S)

    # Add number of survivors at the end of the experiment:
    y = np.concatenate((y, [N - np.sum(y)]))

    # Calculate log-likelihood of multinomial distribution at point y:
    LL = multinomial.logpmf(y, n=N, p=p)

    return LL

Prior

The parameter \(\lambda\) must be positive. Furthermore, we know that values around 0.3 are quite plausible, and that values below 0.1 and above 0.7 are pretty unlikely. Which distribution would you choose for the prior?

Implement a function that returns the logarithm of the probability density of the prior distribution for \(\lambda\).
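The templates below parameterize the lognormal distribution by moment matching: for \(X \sim \mathrm{LogNormal}(\mu, \sigma)\), the mean is \(m = e^{\mu + \sigma^2/2}\) and the variance is \(s^2 = (e^{\sigma^2}-1)\,e^{2\mu+\sigma^2}\). Solving for the log-space parameters given a desired mean \(m\) and standard deviation \(s\) yields \[ \sigma^2 = \log\!\left(1 + \frac{s^2}{m^2}\right), \qquad \mu = \log m - \frac{\sigma^2}{2} \;, \] which is exactly the transformation used in the code below.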

R

You can use a lognormal prior distribution by understanding and filling in the block below.

# we use a lognormal distribution, which naturally accounts
# for the fact that lambda is positive

logprior.survival <- function(par){

  # a mean and standard deviation that approximate the prior information given above are:
  mu <- ???    # since the distribution is skewed, choose it higher than 0.3 (the mode)
  sigma <- ??? # a value similar to mu will do

  # calculate the mean and standard deviation in the log-space
  meanlog <- log(mu) - 1/2*log(1+(sigma/mu)^2) #
  sdlog <- sqrt(log(1+(sigma/mu)^2))           #

  # Use these to parameterize the prior
  dlnorm(???)
}

Julia

You can use a lognormal prior distribution by understanding and filling in the block below.

function logprior_survival(λ)
    # a mean and standard deviation that approximate the prior information given above are:
    m = ?       # the distribution is skewed, choose a mean higher than 0.3 (the mode)
    sd = ?      # a value similar to the mean will do

    # calculate the mean and standard deviation in the log-space
    μ = log(m/sqrt(1+sd^2/m^2))
    σ = sqrt(log(1+sd^2/m^2))

    # Use these to parameterize the prior
    return logpdf(LogNormal(???), ???)
end

Python

You can use a lognormal prior distribution by understanding and filling in the block below.

def logprior_survival(par):
    # Define mean and standard deviation that approximate the prior information given
    mu = 0.6    # Since the distribution is skewed, choose it higher than 0.3 (the mode)
    sigma = 0.7 # A value similar to mu will do

    # Calculate the mean and standard deviation in the log-space
    meanlog = np.log(mu) - 0.5 * np.log(1 + (sigma / mu)**2)
    sdlog = np.sqrt(np.log(1 + (sigma / mu)**2))

    # Calculate the log prior using log-normal distribution
    log_prior = lognorm.logpdf(par, s=sdlog, scale=np.exp(meanlog))

    return log_prior

Posterior

Finally, define a function that evaluates the density of the log posterior distribution.

This function should take the data from the file model_survival.csv and a parameter value as inputs.

Check what happens if you pass an “illegal”, i.e. negative, parameter value. The function must not crash in this case.
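One way to satisfy this requirement, sketched here in Python (the same guard carries over to R and Julia; it uses the functions defined above):

def logposterior_survival(par, data):
    # Reject impossible parameter values before evaluating the likelihood,
    # which would otherwise produce invalid probabilities:
    if par <= 0:
        return -np.inf
    return (loglikelihood_survival(data['deaths'], N=30, t=data['time'], par=par)
            + logprior_survival(par))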

R

log.posterior <- function(par, data) {
    ...
}

Julia

function logposterior_survival(λ, data)
    return ...
end

Python

def logposterior_survival(par, data):
    # Calculate log-likelihood
    log_likelihood = ??
    
    # Calculate log-prior
    log_prior = ??
    
    # Calculate log-posterior
    log_posterior = log_likelihood + log_prior
    
    return log_posterior

Code your own Markov Chain Metropolis sampler

Implement your own elementary Markov Chain Metropolis sampler and run it with the posterior defined above for a few thousand iterations. You can find a description of the steps needed in Carlo’s slides.
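For a symmetric jump distribution, the Metropolis acceptance probability of a proposal \(\lambda'\) given the current state \(\lambda^{(i-1)}\) is \[ \alpha = \min\!\left(1,\; \frac{p(\lambda'|y)}{p(\lambda^{(i-1)}|y)}\right) = \min\!\left(1,\; \exp\!\left(\log p(\lambda'|y) - \log p(\lambda^{(i-1)}|y)\right)\right), \] which is what the min(1, ...) blanks in the templates below stand for.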

R

# Set sample size (make it feasible)
sampsize <- 5000

# Create an empty matrix to hold the chain:
chain <- matrix(nrow=sampsize, ncol=2)
colnames(chain) <- c("lambda","logposterior")

# Set start value for your parameter:
par.start <- ???

# Compute the log posterior value of your start parameter
chain[1,] <- ???

# Run the MCMC:
for (i in 2:sampsize){

  # Propose new parameter value based on previous one
  # (choose a proposal distribution):
  par.prop <- ???

  # Calculate logposterior of the proposed parameter value:
  logpost.prop <- ???

  # Calculate acceptance probability by using the Metropolis criterion min(1, ... ):
  acc.prob <- ???

  # Store in chain[i,] the new parameter value if accepted, or re-use the old one if rejected:
  if (runif(1) < acc.prob){
    ???
   } else {
    ???
   }
}

Julia

# set your sampsize
sampsize = 5000

# create empty vectors to hold the chain and the log posteriors values
chain = Vector{Float64}(undef, sampsize);
log_posts = Vector{Float64}(undef, sampsize);

# Set start value for your parameter
par_start = ???

# compute the log posterior value of your start parameter
chain[1] = ???
log_posts[1] = ???

# Run the MCMC:
for i in 2:sampsize

  # Propose new parameter value based on previous one (choose a proposal distribution):
  par_prop = ???

  # Calculate logposterior of the proposed parameter value:
  logpost_prop = ???

  # Calculate acceptance probability by using the Metropolis criterion min(1, ... ):
  acc_prob = ???

  # Store in chain[i] the new parameter value if accepted,
  # or re-store old one if rejected:
  if rand() < acc_prob # rand() is the same as rand(Uniform(0,1))
    ???
  else # if not accepted, stay at the current parameter value
    ???
  end
end

Python

# Set sample size (make it feasible)
sampsize = 5000

# Create an empty array to hold the chain
chain = np.zeros((sampsize, 2))
# Define column names
column_names = ['lambda', 'logposterior']

# Set start value for your parameter
par_start = {'lambda': 0.5}

# Compute posterior value of the first chain using the start value
# of the parameter and the logprior and loglikelihood functions already defined
chain[0, 0] = par_start['lambda']
chain[0, 1] = logposterior_survival(par_start['lambda'], data_survival)

# Run the MCMC
for i in range(1, sampsize):
    # Propose a new parameter value based on the previous one (think of proposal distribution)
    par_prop = ??

    # Calculate log-posterior of the proposed parameter value (use likelihood and prior)
    logpost_prop = ??

    # Calculate acceptance probability
    acc_prob = ??

    # Store in chain[i, :] the new parameter value if accepted, or re-use the old one if rejected
    if np.random.rand() < acc_prob:
        chain[i, :] = [par_prop, logpost_prop]
    else:  # if not accepted, stay at the current parameter value
        chain[i, :] = chain[i - 1, :]

# If you want to convert the array to a DataFrame for further analysis
df_chain = pd.DataFrame(chain, columns=column_names)

Experiment with your sampler:

  • Plot the resulting chain. Does it look reasonable?
  • Test the effect of different jump distributions.
  • Find the value of \(\lambda\) within your sample for which the posterior is maximal.
  • Create a histogram or density plot of the posterior sample. Look at the chain plots to decide how many burn-in samples you need to remove.
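While experimenting, the empirical acceptance rate is a useful quick diagnostic; for a one-dimensional random-walk Metropolis sampler, acceptance rates of very roughly 0.3–0.5 are usually considered healthy. A minimal Python sketch, assuming chain is the 1-D array of sampled \(\lambda\) values:

import numpy as np

def acceptance_rate(chain):
    # a proposal was accepted whenever two consecutive states differ
    return np.mean(chain[1:] != chain[:-1])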

Solution

R

Defining the likelihood, prior, and posterior

loglikelihood.survival <- function(y, N, t, par){

    ## Calculate survival probabilities at measurement points
    ## and add zero "at time-point infinity":
    S <- exp(-par*c(0,t))
    S <- c(S,0)

    ## Calculate probabilities to die within observation windows:
    p <- -diff(S)

    ## Add number of survivors at the end of the experiment:
    y <- c(y, N-sum(y))

    ## Calculate log-likelihood of multinomial distribution at point y:
    LL <- dmultinom(y, prob=p, log=TRUE)

    return(LL)
}

Using a log normal distribution as prior

logprior.survival <- function(par){

  # a mean and standard deviation that approximate the prior information given above are:
  mu <- 0.6    # since the distribution is skewed, choose it higher than 0.3 (the mode)
  sigma <- 0.7 # a value similar to mu will do

  # calculate the mean and standard deviation in the log-space
  meanlog <- log(mu) - 1/2*log(1+(sigma/mu)^2) #
  sdlog <- sqrt(log(1+(sigma/mu)^2))           #

  # Use these to parameterize the prior
  dlnorm(par, meanlog=meanlog, sdlog=sdlog, log=TRUE)
}

We also define a function for the log posterior

logposterior.survival <- function(par, data) {
    loglikelihood.survival(data$deaths, N=30, data$time, par) + logprior.survival(par)
}

Code your own Markov Chain Metropolis sampler

# Set sample size (make it feasible)
sampsize <- 5000

# Create an empty matrix to hold the chain:
chain <- matrix(nrow=sampsize, ncol=2)
colnames(chain) <- c("lambda","logposterior")

# Set start value for your parameter:
par.start <- c("lambda"=0.5)

# Compute posterior value of first chain by using your start value
# of the parameter, and the logprior and loglikelihood already defined
chain[1,] <- c(par.start,
               logposterior.survival(par.start, data.survival))

# Run the MCMC:
for (i in 2:sampsize){

  # Propose new parameter value based on previous one (think of proposal distribution):
  par.prop <- chain[i-1,1] + rnorm(1, mean=0, sd=0.01)

  # Calculate logposterior of the proposed parameter value (use likelihood and prior):
  logpost.prop <- logposterior.survival(par.prop, data.survival)

  # Calculate acceptance probability:
  acc.prob <- min( 1, exp(logpost.prop - chain[i-1,2]))

  # Store in chain[i,] the new parameter value if accepted, or re-use the old one if rejected:
  if (runif(1) < acc.prob){
    chain[i,] <- c(par.prop, logpost.prop)
  } else {  # if not accepted, stay at the current parameter value
    chain[i,] <-  chain[i-1,]
  }
}

Plot the resulting chain. Does it look reasonable? Have you selected an appropriate jump distribution?

plot(chain[,"lambda"], ylab=expression(lambda), pch=19, cex=0.25, col='black')

Find the value of \(\lambda\) within your sample, for which the posterior is maximal.

# Find the parameter value corresponding to the maximum posterior density
chain[which.max(chain[,2]),]
##       lambda logposterior 
##    0.1917226   -9.7517009

Create a density plot of the posterior sample using plot(density()), but remove the burn-in first. Inspect the chain in the figure above: how many samples would you discard as burn-in?

# Create a density plot of the posterior sample using `plot(density())` without burn-in
plot(density(chain[1000:nrow(chain),1]), xlab=expression(lambda), main="")

Julia

Defining the likelihood, prior, and posterior

using Distributions

function loglikelihood_survival(N::Int, t::Vector, y::Vector, λ)
    S = exp.(-λ .* t)
    S = [1; S; 0]
    p = -diff(S)
    ## Add number of survivors at the end of the experiment:
    y = [y; N-sum(y)]

    logpdf(Multinomial(N, p), y)
end
## loglikelihood_survival (generic function with 1 method)

function logprior_survival(λ)
    # a mean and standard deviation that approximate the prior information given above:
    m = 0.5     # since the distribution is skewed, choose it higher than 0.3 (the mode)
    sd = 0.5    # a value similar to the mean will do

    # calculate the mean and standard deviation in the log-space
    μ = log(m/sqrt(1+sd^2/m^2))
    σ = sqrt(log(1+sd^2/m^2))

    # Use these to parameterize the prior
    return logpdf(LogNormal(μ, σ), λ)
end
## logprior_survival (generic function with 1 method)

Note that we check whether a parameter value is possible before calling the likelihood function:

function logposterior_survival(λ, data)
    lp = logprior_survival(λ)
    if !isinf(lp)
        lp += loglikelihood_survival(30, data.time, data.deaths, λ)
    end
    lp
end
## logposterior_survival (generic function with 1 method)

Code your own Markov Chain Metropolis sampler

# set your sampsize
sampsize = 5000

# create empty vectors to hold the chain and the log posteriors values
chain = Vector{Float64}(undef, sampsize);
log_posts = Vector{Float64}(undef, sampsize);

# Set start value for your parameter
par_start = 0.5

# compute the log posterior value of your start parameter
chain[1] = par_start
log_posts[1] = logposterior_survival(par_start, survival_data)

# Run the MCMC:
for i in 2:sampsize
    # Propose new parameter value based on previous one (choose a proposal distribution):
    par_prop = chain[i-1] + rand(Normal(0, 0.01))

    # Calculate logposterior of the proposed parameter value:
    logpost_prop = logposterior_survival(par_prop, survival_data)

    # Calculate acceptance probability by using the Metropolis criterion min(1, ... ):
    acc_prob = min(1, exp(logpost_prop - log_posts[i-1]))

    # Store in chain[i] the new parameter value if accepted, or re-store old one if rejected:
    if rand() < acc_prob # rand() is the same as rand(Uniform(0,1))
        chain[i] = par_prop
        log_posts[i] = logpost_prop
    else # if not accepted, stay at the current parameter value
        chain[i] = chain[i-1]
        log_posts[i] = log_posts[i-1]
    end
end

The so-called trace plot can give us an idea of how well the chain has converged:

plot(chain, label=false, xlab="iteration", ylab="λ")

Find the parameter value corresponding to the maximum posterior density

chain[argmax(log_posts)]
## 0.1922136100380536
log_posts[argmax(log_posts)]
## -9.619981377536545

A histogram of the posterior. Note that we removed the first 1000 iterations as burn-in:

histogram(chain[1000:end], xlab="λ", label=false)

Python

Defining the likelihood, prior, and posterior

The following libraries are required:

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
from scipy.stats import multinomial, gaussian_kde, lognorm, norm
# Loading data
# Specify the path to the CSV file
file_path = r"../../data/model_survival.csv"

# Load the CSV file into a pandas DataFrame
data_survival = pd.read_csv(file_path, sep=" ")

print(data_survival)
##    time  deaths
## 0     1       2
## 1     2       7
## 2     3       4
## 3     4       5
## 4     5       1
def loglikelihood_survival(y, N, t, par):
    # Calculate survival probabilities at measurement points
    # and add zero "at time-point infinity":
    S = np.exp(-par * np.concatenate(([0], t)))
    S = np.concatenate((S, [0]))

    # Calculate probabilities to die within observation windows:
    p = -np.diff(S)

    # Add number of survivors at the end of the experiment:
    y = np.concatenate((y, [N - np.sum(y)]))

    # Calculate log-likelihood of multinomial distribution at point y:
    LL = multinomial.logpmf(y, n=N, p=p)

    return LL

Using a log normal distribution as prior

def logprior_survival(par):
    # Define mean and standard deviation that approximate the prior information given
    mu = 0.6    # Since the distribution is skewed, choose it higher than 0.3 (the mode)
    sigma = 0.7 # A value similar to mu will do

    # Calculate the mean and standard deviation in the log-space
    meanlog = np.log(mu) - 0.5 * np.log(1 + (sigma / mu)**2)
    sdlog = np.sqrt(np.log(1 + (sigma / mu)**2))

    # Calculate the log prior using log-normal distribution
    log_prior = lognorm.logpdf(par, s=sdlog, scale=np.exp(meanlog))

    return log_prior

We also define a function for the log posterior

def logposterior_survival(par, data):
    # Calculate log-likelihood
    log_likelihood = loglikelihood_survival(data['deaths'], N=30, t=data['time'], par=par)
    
    # Calculate log-prior
    log_prior = logprior_survival(par)
    
    # Calculate log-posterior
    log_posterior = log_likelihood + log_prior
    
    return log_posterior

Code your own Markov Chain Metropolis sampler

# Set sample size (make it feasible)
sampsize = 5000

# Create an empty array to hold the chain
chain = np.zeros((sampsize, 2))
# Define column names
column_names = ['lambda', 'logposterior']

# Set start value for your parameter
par_start = {'lambda': 0.5}

# Compute posterior value of the first chain using the start value
# of the parameter and the logprior and loglikelihood functions already defined
chain[0, 0] = par_start['lambda']
chain[0, 1] = logposterior_survival(par_start['lambda'], data_survival)

# Run the MCMC
for i in range(1, sampsize):
    # Propose a new parameter value based on the previous one (think of proposal distribution)
    par_prop = chain[i - 1, 0] + np.random.normal(0, 0.01)

    # Calculate log-posterior of the proposed parameter value (use likelihood and prior)
    logpost_prop = logposterior_survival(par_prop, data_survival)

    # Calculate acceptance probability
    acc_prob = min(1, np.exp(logpost_prop - chain[i - 1, 1]))

    # Store in chain[i, :] the new parameter value if accepted, or re-use the old one if rejected
    if np.random.rand() < acc_prob:
        chain[i, :] = [par_prop, logpost_prop]
    else:  # if not accepted, stay at the current parameter value
        chain[i, :] = chain[i - 1, :]

# If you want to convert the array to a DataFrame for further analysis
df_chain = pd.DataFrame(chain, columns=column_names)

# Now `df_chain` contains the MCMC chain data in a pandas DataFrame

Plot the resulting chain. Does it look reasonable? Have you selected an appropriate jump distribution?

# Extract lambda values from the chain
lambda_values = chain[:, 0]  # Assuming lambda values are in the first column

# Create the scatter plot
plt.figure(figsize=(10, 6))
plt.plot(lambda_values, 'o', markersize=2, color='black')

# Set y-axis label with Greek letter lambda (λ)
plt.ylabel(r'$\lambda$')

# Set x-axis label and title (optional)
plt.xlabel('Sample Index')

# Display the plot
plt.tight_layout()
plt.show()

Find the value of \(\lambda\) within your sample, for which the posterior is maximal.

# Find the index of the maximum log posterior density
max_index = np.argmax(chain[:, 1])

# Retrieve the parameter value corresponding to the maximum posterior density
max_posterior_value = chain[max_index, :]

# Print the result
print(f"Lambda: {max_posterior_value[0]}, Log posterior: {max_posterior_value[1]}")
## Lambda: 0.19174332253190032, Log posterior: -9.751700873138514

Create a density plot of the posterior sample (e.g. with a kernel density estimate), but remove the burn-in first. Inspect the chain in the figure above: how many samples would you discard as burn-in?

# Define the portion of the chain to use (without burn-in)
chain_without_burn_in = chain[1000:, 0]  # Considering only the first column of the chain

# Create a kernel density estimation of the posterior sample
kde = gaussian_kde(chain_without_burn_in)

# Generate a range of x values over which to evaluate the density estimate
x = np.linspace(min(chain_without_burn_in), max(chain_without_burn_in), 1000)

# Plot the density estimate
plt.figure(figsize=(10, 6))
plt.plot(x, kde(x), label='Density')

# Set x-axis label with Greek letter lambda (λ)
plt.xlabel(r'$\lambda$')

# Title is set as an empty string to match the provided R code
plt.title("")

# Show the plot
plt.tight_layout()
plt.show()

2. Sample from the posterior for the monod model ★

The main difference between the “Monod” and the “Survival” model is that the former has more than one parameter that we would like to infer.

You can either generalize your own Metropolis sampler to higher dimensions or use a sampler from a package.

Read data

We use the data model_monod_stoch.csv:

data.monod <- read.table("../../data/model_monod_stoch.csv", header=T)
plot(data.monod$C, data.monod$r, xlab="C", ylab="r", main='"model_monod_stoch.csv"')

Define likelihood, prior, and posterior

Again, we need to prepare a function that evaluates the log posterior. You can use the templates below.

R

# Logprior for model "monod": lognormal distribution

prior.monod.mean <- 0.5*c(r_max=5, K=3, sigma=0.5)
prior.monod.sd   <- 0.5*prior.monod.mean

logprior.monod <- function(par,mean,sd){

    sdlog    <- ???
    meanlog  <- ???

    return(sum(???)) # we work log-space, hence sum over multiple independent data points
}

# Log-likelihood for model "monod"

loglikelihood.monod <- function( y, C, par){

  # deterministic part:
  y.det <- ???

  # Calculate loglikelihood:
  return( sum(???) )
}

# Log-posterior for model "monod" given data 'data.monod' with layout 'L.monod.obs'

logposterior.monod <- function(par) {
  logpri <- logprior.monod(???)
  if(logpri==-Inf){
    return(-Inf)
  } else {
    return(???) # sum log-prior and log-likelihood
  }
}

Julia

include("../../models/models.jl") # load monod model
using ComponentArrays
using Distributions

# set parameters
par = ComponentVector(r_max = 5, K=3, sigma=0.5)

prior_monod_mean = 0.5 .* par
prior_monod_sd = 0.25 .* par

# Use a lognormal distribution for all model parameters
function logprior_monod(par, m, sd)
    m = ...
    σ = ...
    return sum(...) # sum because we are in the log-space
end


# Log-likelihood for model "monod"
function loglikelihood_monod(par::ComponentVector, data::DataFrame)
    y_det = ...
    return sum(...)
end

# Log-posterior for model "monod"
function logposterior_monod(par::ComponentVector)
    logpri = logprior_monod(...)
    if logpri == -Inf
        return ...
    else
        return ...    # sum log-prior and log-likelihood
    end
end

Python

# Define the prior mean and standard deviation
prior_monod_mean = 0.5 * np.array([5, 3, 0.5])  # r_max, K, sigma
prior_monod_sd = 0.5 * prior_monod_mean

# Log prior for Monod model
def logprior_monod(par, mean, sd):
    """
    Log-prior function for the Monod model.

    Parameters:
    - par: array-like, the parameters to evaluate the prior (e.g., ['r_max', 'K', 'sigma'])
    - mean: array-like, the mean values for the parameters
    - sd: array-like, the standard deviation values for the parameters

    Returns:
    - log_prior: the sum of log-prior values calculated from the log-normal distribution
    """
    
    # Calculate sdlog and meanlog for log-normal distribution
    var = ??
    sdlog = ??
    meanlog = ??

    # Calculate the sum of log probabilities from the log-normal distribution
    log_prior = np.sum(??)

    return log_prior

def loglikelihood_monod(y, C, par):
    """
    Log-likelihood function for the Monod model.

    Parameters:
    - y: observed growth rates
    - C: substrate concentrations
    - par: dictionary containing the parameters ('r_max', 'K', 'sigma')

    Returns:
    - log_likelihood: the sum of log-likelihood values
    """
    # Calculate growth rate using Monod model
    y_det = ??
    
    # Calculate log-likelihood assuming a normal distribution
    log_likelihood = np.sum(??)
    
    return log_likelihood

def logposterior_monod(par):
    """
    Log-posterior function for the Monod model.

    Parameters:
    - par: dictionary containing the parameters ('r_max', 'K', 'sigma')

    Returns:
    - log_posterior: the log posterior probability
    """
    # Calculate the log prior
    lp = ??
    
    # If log prior is finite, calculate log likelihood and sum
    if np.isfinite(lp):
        log_posterior = ?? + ??
        return log_posterior
    else:
        # Return negative infinity if log prior is not finite
        return -np.inf

Create initial chain

First run an initial chain, which most likely will not mix well yet.

For the jump distribution, assume a diagonal covariance matrix with reasonable values. A diagonal covariance matrix (with zero-valued off-diagonal entries) corresponds to an independent proposal distribution for each parameter.
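For example, in Python (a sketch; the standard deviations below are placeholder assumptions that you should tune for your problem):

import numpy as np

# assumed per-parameter proposal standard deviations for (r_max, K, sigma)
proposal_sd = np.array([0.5, 0.5, 0.05])

# a diagonal covariance matrix corresponds to independent proposals
jump_cov = np.diag(proposal_sd**2)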

R

The code below uses the function adaptMCMC::MCMC. If the adaptation is set to FALSE, this corresponds to the basic Metropolis sampler. Of course, feel free to use your own implementation instead!

library(adaptMCMC)


## As start values for the Markov chain we can use the mean of the prior
par.start <- c(r_max=2.5, K=1.5, sigma=0.25)


## sample
monod.chain <- adaptMCMC::MCMC(p  = ...,
                               n = ...,
                               init = ...,
                               adapt = FALSE # for this exercise we do not want to use automatic adaptation
                               )
monod.chain.coda <- adaptMCMC::convert.to.coda(monod.chain) # this is useful for plotting

Plot the chain and look at the rejection rate to judge whether the standard deviation of the jump distribution should be increased (if the rejection frequency is too low) or decreased (if it is too high). As a rough rule of thumb, acceptance rates of about 0.2–0.5 work well for random-walk Metropolis. What do you think?

monod.chain$acceptance.rate

plot(monod.chain.coda)

pairs(monod.chain$samples)

Julia

The package KissMCMC provides a very basic Metropolis sampler. Of course, feel free to use your own implementation instead!

using KissMCMC: metropolis
using Distributions
using LinearAlgebra: diagm

# define a function that generates a proposal given θ:
Σjump = diagm(ones(3))
sample_proposal(θ) = θ .+ rand(MvNormal(zeros(length(θ)), Σjump))

# run sampler
samples, acceptance_ratio, lp = metropolis(logposterior_monod,
                                           sample_proposal,
                                           par;
                                           niter = 10_000,
                                           nburnin = 0);

## for convenience we convert our samplers in a `MCMCChains.Chains`
using MCMCChains
using StatsPlots
chn = Chains(samples, labels(par))

plot(chn)
corner(chn)

Array(chn)                      # convert it to normal Array

Python

The code below uses the module emcee, which provides an MCMC sampler.

import emcee
from emcee.moves import GaussianMove

# Number of dimensions (parameters)
ndim = 3  # For our model, 3 dimensions for r_max, K, sigma

# Create a diagonal covariance matrix 
jump_cov = ??

# Create an instance of GaussianMove using the jump covariance matrix
move = GaussianMove(jump_cov)

# Define the initial parameter values (mean of the prior)
par_init = np.array([2.5, 1.5, 0.25])  # r_max, K, sigma

# Define the number of samples
n_samples = ??

# Define the number of dimensions (parameters)
ndim = len(par_init)

# Define the number of walkers for the MCMC chain
nwalkers = 10 * ndim  # A common rule of thumb is 10 times the number of dimensions

# Initialize the walkers around the initial parameter values with a small random perturbation
init_pos = par_init + 1e-4 * np.random.randn(nwalkers, ndim)

# Initialize the `emcee` sampler
sampler = emcee.EnsembleSampler(nwalkers, ndim, logposterior_monod, moves=move)

# Run the MCMC chain
sampler.run_mcmc(init_pos, n_samples, progress=False) 
# If set to True, the progress bar gives a overview of the processing time

Check out the results:

# Retrieve the acceptance rate
print("Mean acceptance fraction: {:.3f}".format(np.mean(sampler.acceptance_fraction)))

# Have an overview of the results
# Retrieve the samples; emcee returns an array of shape (iteration, walker, parameter)
chain_samples = sampler.get_chain()

# Collapse the walker dimension to obtain a 2D (iteration, parameter) array,
# e.g. by computing the mean over the walkers
reshaped_chain = np.mean(chain_samples, axis=1)

# Or by selecting one of them
reshaped_chain = chain_samples[:, 0, :]

# Convert reshaped chain samples to pandas DataFrame
chain_df = pd.DataFrame(reshaped_chain, columns=['r_max', 'K', 'sigma'])

# Calculate and display summary statistics (mean, SD, quantiles, etc.)
summary = chain_df.describe(percentiles=[0.025, 0.25, 0.5, 0.75, 0.975])

print("Summary statistics:", summary)

Improve jump distribution

The plot of the previous chain gives us some indication of what a more efficient jump distribution could look like.

  • Try to manually change the variance of the jump distribution. Can we get better chains?

  • Use the previous chain and estimate its covariance. Does a correlated jump distribution work better? (See the sketch below.)
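A minimal Python sketch of the second idea, assuming samples is an (n, d) array from the first run; the factor \(2.38^2/d\) is a classical scaling rule of thumb for random-walk Metropolis (the R solution below simply uses a factor \((1/2)^2\)):

import numpy as np

# estimate the covariance of the previous chain and scale it
d = samples.shape[1]
jump_cov = np.cov(samples, rowvar=False) * 2.38**2 / d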

R

The function adaptMCMC::MCMC takes argument scale to modify the jump distribution:

scale: vector with the variances _or_ covariance matrix of the jump
          distribution.

Julia

Adapt the covariance matrix Σjump of the jump distribution that is used in the function sample_proposal.

Python

In emcee, the default move type used in the EnsembleSampler is the so-called stretch move (StretchMove). This move is well suited for a wide range of models and performs efficiently in higher-dimensional parameter spaces. The stretch move works by randomly choosing a pair of walkers and proposing a new position for one walker based on a stretch factor relative to the other walker’s position.
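Concretely, in the stretch move of Goodman & Weare (2010), a walker \(X_j\) is updated with the help of another randomly chosen walker \(X_k\) via \[ X_j' = X_k + Z\,(X_j - X_k)\;, \] where the stretch factor \(Z\) is drawn from a density \(g(z) \propto 1/\sqrt{z}\) on \([1/a,\, a]\) (with \(a = 2\) by default), and the acceptance probability contains an additional factor \(Z^{d-1}\) in \(d\) dimensions.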

Residual Diagnostics

The assumptions about the observation noise should be validated. Often we can do this by looking at the model residuals, that is, the differences between observations and model predictions.
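For the “Monod” model this means looking at \[ r_i = y_i - y^{\mathrm{det}}(C_i;\, \theta_{\mathrm{MAP}})\;, \] which, under the likelihood assumed above, should behave like independent draws from \(N(0, \sigma^2)\).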

  • Find the parameters in the posterior sample that correspond to the largest posterior value (the so-called maximum a posteriori (MAP) estimate).

  • Run the deterministic “Monod” model with these parameters and analyze the residuals.

    • Are they normally distributed?
    • Are they independent?
    • Are they centered around zero?
    • Do you see any suspicious structure?

R

The functions qqnorm and acf can be helpful.

Julia

The functions StatsBase.autocor and StatsPlots.qqnorm(x) can be helpful.

Python

The function acf from statsmodels.tsa.stattools and the scipy.stats module (e.g. probplot) might be useful.

Solution

R

Read data

data.monod <- read.table("../../data/model_monod_stoch.csv", header=T)
plot(data.monod$C, data.monod$r, xlab="C", ylab="r", main='"model_monod_stoch.csv"')

Define likelihood, prior, and posterior

source("../../models/models.r") # load the monod model

## Logprior for model "monod": lognormal distribution

prior.monod.mean <- 0.5 * c(r_max=5, K=3, sigma=0.5)
prior.monod.sd   <- 0.5 * prior.monod.mean

logprior.monod <- function(par, mean, sd){
    sdlog <- sqrt(log(1+sd*sd/(mean*mean)))
    meanlog <- log(mean) - sdlog*sdlog/2
    return(sum(dlnorm(par, meanlog=meanlog, sdlog=sdlog, log=TRUE)))
}

## Log-likelihood for model "monod"

loglikelihood.monod <- function(y, C, par){

    ## deterministic part:
    y.det <- model.monod(par, C) # defined in `models.r`

    ## Calculate loglikelihood assuming independence:
    return( sum(dnorm(y, mean=y.det, sd=par['sigma'], log=TRUE )) )
}

## Log-posterior for model "monod"

logposterior.monod <- function(par) {
    lp <- logprior.monod(par, prior.monod.mean, prior.monod.sd)
    if(is.finite(lp)){
        return( lp + loglikelihood.monod(data.monod$r, data.monod$C, par) )
    } else {
        return(-Inf)
    }
}

Create initial chain

library(adaptMCMC)

## As start values for the Markov chain we can use the mean of the prior
par.init <- c(r_max=2.5, K=1.5, sigma=0.25)

## sample
monod.chain <- adaptMCMC::MCMC(p = logposterior.monod,
                               n = 5000,
                               init = par.init,
                               scale = c(1,1,1),
                               adapt = FALSE, # for this exercise we do not
                                              # want to use automatic adaptation
                               showProgressBar = FALSE
                               )
##   generate 5000 samples
monod.chain.coda <- adaptMCMC::convert.to.coda(monod.chain) # this is useful for plotting
summary(monod.chain.coda)
## 
## Iterations = 1:5000
## Thinning interval = 1 
## Number of chains = 1 
## Sample size per chain = 5000 
## 
## 1. Empirical mean and standard deviation for each variable,
##    plus standard error of the mean:
## 
##         Mean      SD Naive SE Time-series SE
## r_max 4.2527 0.32411 0.004584        0.06628
## K     1.4403 0.36616 0.005178        0.06634
## sigma 0.4822 0.09289 0.001314        0.01735
## 
## 2. Quantiles for each variable:
## 
##         2.5%    25%    50%    75%  97.5%
## r_max 3.6496 3.9849 4.2649 4.4506 5.0970
## K     0.7260 1.1661 1.3773 1.6447 2.4658
## sigma 0.3779 0.4129 0.4714 0.4995 0.6852

The acceptance rate is very low:

monod.chain$acceptance.rate
## [1] 0.01

The poor mixing is also visible from the chain plots:

plot(monod.chain.coda)

Here we see a strong correlation between \(K\) and \(r_{max}\):

pairs(monod.chain$samples)

Improve jump distribution

We use the first chain to estimate its covariance matrix, which we then use as the jump distribution in a second run:

Sigma.jump <- cov(monod.chain$samples) * (1/2)^2

monod.chain2 <- adaptMCMC::MCMC(p = logposterior.monod,
                                n = 5000,
                                init = par.init,
                                scale = Sigma.jump,
                                adapt = FALSE,
                                showProgressBar = FALSE
                                )
##   generate 5000 samples
monod.chain.coda2 <- adaptMCMC::convert.to.coda(monod.chain2)
summary(monod.chain.coda2)
## 
## Iterations = 1:5000
## Thinning interval = 1 
## Number of chains = 1 
## Sample size per chain = 5000 
## 
## 1. Empirical mean and standard deviation for each variable,
##    plus standard error of the mean:
## 
##         Mean      SD Naive SE Time-series SE
## r_max 4.4053 0.33911 0.004796       0.023498
## K     1.6809 0.41673 0.005893       0.030887
## sigma 0.4707 0.07129 0.001008       0.003914
## 
## 2. Quantiles for each variable:
## 
##         2.5%    25%    50%    75%  97.5%
## r_max 3.7741 4.1796 4.4056 4.6102 5.0514
## K     0.9795 1.3816 1.6580 1.9444 2.6033
## sigma 0.3507 0.4197 0.4654 0.5148 0.6267

Everything looks much better now!

monod.chain2$acceptance.rate
## [1] 0.518
plot(monod.chain.coda2)

pairs(monod.chain2$samples)

Residual Diagnostics

First we compute the residuals with the ‘best’ parameters

# get maximum of posterior parameters
par.max <- monod.chain2$samples[which.max(monod.chain2$log.p),]

residuals <- data.monod$r - model.monod(par.max, C=data.monod$C)

Plotting the residuals. We hope not to find any structure, but this model is clearly not perfect:

plot(data.monod$r, residuals, xlab="r")
abline(h=0, col=2)

Quantile-quantile plots are a great way to check whether the residuals are normally distributed (which was our assumption for the likelihood function). Here we see that we have too heavy tails.

qqnorm(residuals)
qqline(residuals)

We also assumed independence of the observations, i.e. we should not see any auto-correlation. This looks good here:

acf(residuals)

Julia

Read data

using DataFrames
import CSV
using Plots

monod_data = CSV.read("../../data/model_monod_stoch.csv", DataFrame)
## 20×2 DataFrame
##  Row │ C        r
##      │ Float64  Float64
## ─────┼──────────────────
##    1 │     0.5  1.24893
##    2 │     1.0  1.69416
##    3 │     1.5  2.54771
##    4 │     2.0  2.07218
##    5 │     2.5  2.67768
##    6 │     3.0  1.32508
##    7 │     3.5  2.74149
##    8 │     4.0  3.31631
##   ⋮  │    ⋮        ⋮
##   14 │     7.0  3.22343
##   15 │     7.5  3.97951
##   16 │     8.0  3.51488
##   17 │     8.5  3.41961
##   18 │     9.0  3.33122
##   19 │     9.5  3.9776
##   20 │    10.0  4.07524
##           5 rows omitted
scatter(monod_data.C, monod_data.r,
        label=false, xlab="C", ylab="r")

Define likelihood, prior, and posterior

include("../../models/models.jl"); # load the definition of `model_monod`
## model_growth
using ComponentArrays

# set parameters
par = ComponentVector(r_max = 5, K=3, sigma=0.5);

prior_monod_mean = 0.5 .* par;
prior_monod_sd = 0.25 .* par;

# Use a lognormal distribution for all model parameters
function logprior_monod(par, m, sd)
    μ = @. log(m/sqrt(1+sd^2/m^2))
    σ = @. sqrt(log(1+sd^2/m^2))
    return sum(logpdf.(LogNormal.(μ, σ), par)) # sum because we are in the log-space
end
## logprior_monod (generic function with 1 method)

# Log-likelihood for model "monod"
function loglikelihood_monod(par::ComponentVector, data::DataFrame)
    y_det = model_monod(data.C, par)
    return sum(logpdf.(Normal.(y_det, par.sigma), data.r))
end
## loglikelihood_monod (generic function with 1 method)

# Log-posterior for model "monod"
function logposterior_monod(par::ComponentVector)
    lp = logprior_monod(par, prior_monod_mean, prior_monod_sd)
    if !isinf(lp)
        lp += loglikelihood_monod(par, monod_data)
    end
    lp
end
## logposterior_monod (generic function with 1 method)

Create initial chain

We use the KissMCMC package, which provides a very basic Metropolis sampler.

using KissMCMC: metropolis
using Distributions
using LinearAlgebra: diagm
using MCMCChains
using StatsPlots

# define a function that generates a proposal given θ:
Σjump = diagm(ones(3))
## 3×3 Matrix{Float64}:
##  1.0  0.0  0.0
##  0.0  1.0  0.0
##  0.0  0.0  1.0
sample_proposal(θ) = θ .+ rand(MvNormal(zeros(length(θ)), Σjump))
## sample_proposal (generic function with 1 method)
# run sampler
samples, acceptance_ratio, lp = metropolis(logposterior_monod,
                                           sample_proposal,
                                           par;
                                           niter = 5_000,
                                           nburnin = 0);


# We convert the samples into an `MCMCChains.Chains` object to
# get plotting and summary functions
chn = Chains(samples, labels(par))
## Chains MCMC chain (5000×3×1 Array{Float64, 3}):
## 
## Iterations        = 1:1:5000
## Number of chains  = 1
## Samples per chain = 5000
## parameters        = r_max, K, sigma
## 
## Summary Statistics
##   parameters      mean       std      mcse   ess_bulk   ess_tail      rhat   e ⋯
##       Symbol   Float64   Float64   Float64    Float64    Float64   Float64     ⋯
## 
##        r_max    4.3298    0.3830    0.0743    27.4031    24.7419    1.0046     ⋯
##            K    1.6396    0.5616    0.1096    25.1816    26.7184    1.0645     ⋯
##        sigma    0.4924    0.0807    0.0132    37.4450    35.4521    1.0588     ⋯
##                                                                 1 column omitted
## 
## Quantiles
##   parameters      2.5%     25.0%     50.0%     75.0%     97.5%
##       Symbol   Float64   Float64   Float64   Float64   Float64
## 
##        r_max    3.7922    3.9585    4.2410    4.5927    5.2362
##            K    0.8287    1.2058    1.5444    2.0297    3.0000
##        sigma    0.3542    0.4499    0.4740    0.5446    0.6739
plot(chn)

corner(chn)

Improve jump distribution

The corner plot showed that the parameters \(K\) and \(r_{max}\) are clearly correlated, and the chain plots show that the variance of the jump distribution is too large.

We use the previous sample to estimate the covariance matrix which is then used as new jump distribution.

# take covariance for previous chain
Σjump = cov(Array(chn)) * (1/2)^2
## 3×3 Matrix{Float64}:
##  0.0366719    0.0490047   0.000346266
##  0.0490047    0.0788504   8.92394e-5
##  0.000346266  8.92394e-5  0.00162757
sample_proposal(θ) = θ .+ rand(MvNormal(zeros(length(θ)), Σjump))
## sample_proposal (generic function with 1 method)
samples2, acceptance_ratio2, lp2 = metropolis(logposterior_monod,
                                              sample_proposal,
                                              par;
                                              niter = 5_000,
                                              nburnin = 0);

chn2 = Chains(samples2, labels(par))
## Chains MCMC chain (5000×3×1 Array{Float64, 3}):
## 
## Iterations        = 1:1:5000
## Number of chains  = 1
## Samples per chain = 5000
## parameters        = r_max, K, sigma
## 
## Summary Statistics
##   parameters      mean       std      mcse   ess_bulk   ess_tail      rhat   e ⋯
##       Symbol   Float64   Float64   Float64    Float64    Float64   Float64     ⋯
## 
##        r_max    4.4159    0.3241    0.0214   232.0111   337.8147    1.0081     ⋯
##            K    1.6709    0.4512    0.0278   274.5231   405.6778    1.0091     ⋯
##        sigma    0.4667    0.0730    0.0051   216.5326   348.4197    1.0006     ⋯
##                                                                 1 column omitted
## 
## Quantiles
##   parameters      2.5%     25.0%     50.0%     75.0%     97.5%
##       Symbol   Float64   Float64   Float64   Float64   Float64
## 
##        r_max    3.8432    4.1957    4.3995    4.6120    5.1464
##            K    0.9419    1.3478    1.6140    1.9517    2.6936
##        sigma    0.3433    0.4144    0.4607    0.5113    0.6238
plot(chn2)

corner(chn2)

Much better!

Residual Diagnostics

First we compute the residuals with the ‘best’ parameters

# get maximum of posterior parameters
par_max = samples2[argmax(lp2)]

# compute residuals
residuals = monod_data.r .- model_monod(monod_data.C, par_max)

Plotting the residuals. We hope not to find any structure, but this model is clearly not perfect:

p = scatter(monod_data.r, residuals, label=false, xlab="r", ylab="residual");
hline!(p, [0], label=false)

Quantile-quantile plots are a great way to check whether the residuals are normally distributed (which was our assumption for the likelihood function). Here we see that we have too heavy tails.

qqnorm(residuals, qqline = :R)               # a bit heavy-tailed

We also assumed independence of the observations, i.e. we should not see any auto-correlation. This looks good here:

using StatsBase
acf = StatsBase.autocor(residuals);
bar(0:(length(acf)-1), acf, label=false, xlab="lag", ylab="correlation")

Python

Read data

from models import model_monod, model_growth
import emcee
from emcee.moves import GaussianMove
import seaborn as sns
# Loading data
# Specify the path to the CSV file
file_path = r"../../data/model_monod_stoch.csv"

# Load the CSV file into a pandas DataFrame
data_monod = pd.read_csv(file_path, sep=" ")

# Assuming `data_monod` is a pandas DataFrame that contains columns 'C' and 'r'
# Plot 'C' against 'r'
plt.figure(figsize=(10, 6))
plt.scatter(data_monod['C'], data_monod['r'])

# Set x-axis label
plt.xlabel("C")

# Set y-axis label
plt.ylabel("r")

# Set title
plt.title('"model_monod_stoch.csv"')

# Display the plot
plt.show()

Define likelihood, prior, and posterior

# Define the prior mean and standard deviation
prior_monod_mean = 0.5 * np.array([5, 3, 0.5])  # r_max, K, sigma
prior_monod_sd = 0.5 * prior_monod_mean

# Log prior for Monod model
def logprior_monod(par, mean, sd):
    """
    Log-prior function for the Monod model.

    Parameters:
    - par: array-like, the parameters to evaluate the prior (e.g., ['r_max', 'K', 'sigma'])
    - mean: array-like, the mean values for the parameters
    - sd: array-like, the standard deviation values for the parameters

    Returns:
    - log_prior: the sum of log-prior values calculated from the log-normal distribution
    """
    
    # Calculate sdlog and meanlog for log-normal distribution
    var = (sd**2) / (mean**2)
    sdlog = np.sqrt(np.log(1 + var))
    meanlog = np.log(mean) - 0.5 * sdlog**2

    # Calculate the sum of log probabilities from the log-normal distribution
    log_prior = np.sum(lognorm.logpdf(par, s=sdlog, scale=np.exp(meanlog)))

    return log_prior

def loglikelihood_monod(y, C, par):
    """
    Log-likelihood function for the Monod model.

    Parameters:
    - y: observed growth rates
    - C: substrate concentrations
    - par: dictionary containing the parameters ('r_max', 'K', 'sigma')

    Returns:
    - log_likelihood: the sum of log-likelihood values
    """
    # Calculate growth rate using Monod model
    y_det = model_monod(par, C)
    
    # Calculate log-likelihood assuming a normal distribution
    log_likelihood = np.sum(norm.logpdf(y, loc=y_det, scale=par[2]))
    
    return log_likelihood


def logposterior_monod(par):
    """
    Log-posterior function for the Monod model.

    Parameters:
    - par: dictionary containing the parameters ('r_max', 'K', 'sigma')

    Returns:
    - log_posterior: the log posterior probability
    """
    # Calculate the log prior
    lp = logprior_monod(par, prior_monod_mean, prior_monod_sd)
    
    # If log prior is finite, calculate log likelihood and sum
    if np.isfinite(lp):
        log_posterior = lp + loglikelihood_monod(data_monod['r'], data_monod['C'], par)
        return log_posterior
    else:
        # Return negative infinity if log prior is not finite
        return -np.inf

# Make sure you have `data_monod` as a pandas DataFrame with columns 'r' and 'C'.

Create initial chain

# Number of dimensions (parameters)
ndim = 3  # For your example, 3 dimensions for r_max, K, sigma

# Create a diagonal covariance matrix with each diagonal entry equal to 1
jump_cov = np.diag([1.0] * ndim)

# Create an instance of GaussianMove using the jump covariance matrix
move = GaussianMove(jump_cov)

# Define the initial parameter values (mean of the prior)
par_init = np.array([2.5, 1.5, 0.25])  # r_max, K, sigma

# Define the number of samples
n_samples = 5000

# Define the number of dimensions (parameters)
ndim = len(par_init)

# Define the number of walkers for the MCMC chain
nwalkers = 10 * ndim  # A common rule of thumb is 10 times the number of dimensions

# Initialize the walkers around the initial parameter values with a small random perturbation
init_pos = par_init + 1e-4 * np.random.randn(nwalkers, ndim)

# Initialize the `emcee` sampler
sampler = emcee.EnsembleSampler(nwalkers, ndim, logposterior_monod, moves=move)

# Run the MCMC chain
sampler.run_mcmc(init_pos, n_samples, progress=False)

# Get the chain samples
chain_samples = sampler.get_chain()
# Display a summary of the chain
print("Chain shape: ", chain_samples.shape)
## Chain shape:  (5000, 30, 3)
# Collapse the walker dimension (axis 1) by averaging over the walkers,
# which yields a 2D array of shape (iteration, parameter)
reshaped_chain = np.mean(chain_samples, axis=1)

# Convert reshaped chain samples to pandas DataFrame
chain_df = pd.DataFrame(reshaped_chain, columns=['r_max', 'K', 'sigma'])

# Calculate and display summary statistics (mean, SD, quantiles, etc.)
summary = chain_df.describe(percentiles=[0.025, 0.25, 0.5, 0.75, 0.975])

print("Summary statistics:")
## Summary statistics:
print(summary)
##              r_max            K        sigma
## count  5000.000000  5000.000000  5000.000000
## mean      4.357326     1.555291     0.468358
## std       0.154491     0.209188     0.038053
## min       2.499990     0.247612     0.250014
## 2.5%      4.106334     1.249781     0.396576
## 25%       4.243117     1.412521     0.443242
## 50%       4.387100     1.526042     0.465294
## 75%       4.452270     1.691225     0.494047
## 97.5%     4.629797     1.976964     0.536025
## max       4.778828     2.101145     1.263637

The acceptance rate is very low:

print("Mean acceptance fraction: {:.3f}".format(np.mean(sampler.acceptance_fraction)))
## Mean acceptance fraction: 0.011

The poor mixing is also visible from the chain plots:

# Assuming `reshaped_chain` is the reshaped 2D array from the MCMC chain
# Each column represents a parameter and each row represents an iteration

# List of parameter names
param_names = ['r_max', 'K', 'sigma']

# Number of parameters (for each variable)
n_params = len(param_names)

# Create a 3x2 grid of subplots
fig, axs = plt.subplots(n_params, 2, figsize=(12, 8))

# Iterate through each parameter
for i in range(n_params):
    # Plot trace on the left subplot (left column)
    axs[i, 0].plot(reshaped_chain[:, i])
    axs[i, 0].set_ylabel(param_names[i])
    axs[i, 0].set_xlabel('Iteration')
    axs[i, 0].set_title(f'Trace of {param_names[i]}')
    axs[i, 0].grid(True)
    
    # Plot KDE (kernel density estimation) on the right subplot (right column)
    sns.kdeplot(reshaped_chain[:, i], ax=axs[i, 1], color='blue')
    axs[i, 1].set_title(f'Density of {param_names[i]}')
    axs[i, 1].grid(True)
    axs[i, 1].set_xlabel(param_names[i])

# Adjust layout and display the plots
plt.tight_layout()
plt.show()

Here we see a strong correlation between \(K\) and \(r_{max}\):

# Define the parameter names (use appropriate names for your parameters)
param_names = ['r_max', 'K', 'sigma']

# Convert the reshaped_chain array to a pandas DataFrame with appropriate column names
df = pd.DataFrame(reshaped_chain, columns=param_names)

# Create pairwise scatter plots between the variables using seaborn's pairplot
pairplot = sns.pairplot(df)
# Manually adjust the layout if needed
pairplot.fig.tight_layout()

# Show the plots
plt.show()

Improve jump distribution

from emcee.moves import StretchMove

# Define the initial parameter values (mean of the prior)
par_init = np.array([2.5, 1.5, 0.25])  # r_max, K, sigma

# Define the number of samples
n_samples = 5000

# Define the number of dimensions (parameters)
ndim = len(par_init)

# Create an instance of the StretchMove
move = StretchMove()

# Define the number of walkers for the MCMC chain
nwalkers = 10 * ndim  # A common rule of thumb is 10 times the number of dimensions

# Initialize the walkers around the initial parameter values with a small random perturbation
init_pos = par_init + 1e-4 * np.random.randn(nwalkers, ndim)

# Initialize the `emcee` sampler
sampler = emcee.EnsembleSampler(nwalkers, ndim, logposterior_monod, moves=move)

# Run the MCMC chain
sampler.run_mcmc(init_pos, n_samples, progress=False)

# Get the chain samples
chain_samples = sampler.get_chain()

Everything looks much better now!

print("Mean acceptance fraction: {:.3f}".format(np.mean(sampler.acceptance_fraction)))
## Mean acceptance fraction: 0.637
# Collapse the walker dimension (axis 1) by averaging over the walkers,
# which yields a 2D array of shape (iteration, parameter)
reshaped_chain = np.mean(chain_samples, axis=1)

# Convert reshaped chain samples to pandas DataFrame
chain_df = pd.DataFrame(reshaped_chain, columns=['r_max', 'K', 'sigma'])

# Calculate and display summary statistics (mean, SD, quantiles, etc.)
summary = chain_df.describe(percentiles=[0.025, 0.25, 0.5, 0.75, 0.975])

print("Summary statistics:")
## Summary statistics:
print(summary)
##              r_max            K        sigma
## count  5000.000000  5000.000000  5000.000000
## mean      4.399105     1.672738     0.472735
## std       0.215795     0.084666     0.039617
## min       2.338839     1.411046     0.249992
## 2.5%      4.311733     1.514861     0.445548
## 25%       4.380889     1.617849     0.461406
## 50%       4.419820     1.666773     0.470559
## 75%       4.458817     1.722637     0.480598
## 97.5%     4.536127     1.849333     0.500528
## max       4.603911     2.152232     0.966680
# Define the parameter names (use appropriate names for your parameters)
param_names = ['r_max', 'K', 'sigma']

# Convert the reshaped_chain array to a pandas DataFrame with appropriate column names
df = pd.DataFrame(reshaped_chain, columns=param_names)

# Create pairwise scatter plots between the variables using seaborn's pairplot
pairplot = sns.pairplot(df)
# Manually adjust the layout if needed
pairplot.fig.tight_layout()

# Show the plots
plt.show()

Residual Diagnostics

First we compute the residuals with the ‘best’ parameters

from scipy import stats

#Get the log-posterior values for each sample in the chain
log_posterior_values = sampler.get_log_prob(flat=False)

# Select one walker
log_posterior_values=log_posterior_values[:, 0]

# Get the index of the maximum log-posterior probability
max_log_p_index = np.argmax(log_posterior_values)

# Get the parameter set that maximizes the log-posterior
# (taken from the same walker that the log-posterior values came from)
par_max = chain_samples[max_log_p_index, 0, :]

# Calculate residuals: observed data minus modeled data using par_max
modeled_data = model_monod(par_max, data_monod['C'])

residuals = data_monod['r'] - modeled_data

Plotting the residuals. We hope not to find any structure, but this model is clearly not perfect:

# Plot observed data (data_monod_r) versus residuals
plt.figure(figsize=(8, 4))
plt.scatter(data_monod['r'], residuals, alpha=0.7)
plt.xlabel("r")  # x-axis label
plt.ylabel("Residuals")  # y-axis label
plt.title("Observed Data vs. Residuals")

# Add a horizontal line at y = 0
plt.axhline(y=0, color='red', linestyle='--')

# Show the plot
plt.tight_layout()
plt.show()

Quantile-quantile plots are a great way to check whether the residuals are normally distributed (which was our assumption for the likelihood function). Here we see that we have too heavy tails.

# Create a figure and axis for the plot
plt.figure(figsize=(8, 4))

# Generate a Q-Q plot using scipy's probplot function
# probplot returns a tuple with the results and a dictionary with the plot data
stats.probplot(residuals, dist="norm", plot=plt)
## ((array([-1.8241636 , -1.38768012, -1.11829229, -0.91222575, -0.73908135,
##        -0.5857176 , -0.44506467, -0.31273668, -0.18568928, -0.06158146,
##         0.06158146,  0.18568928,  0.31273668,  0.44506467,  0.5857176 ,
##         0.73908135,  0.91222575,  1.11829229,  1.38768012,  1.8241636 ]), array([-1.54989379, -0.3899419 , -0.38391083, -0.34757682, -0.28075127,
##        -0.26961907, -0.13906543, -0.01490088, -0.01369238,  0.01181546,
##         0.08450874,  0.17331492,  0.18629004,  0.22739428,  0.29849992,
##         0.3329457 ,  0.3647414 ,  0.40396748,  0.49737373,  0.81014953])), (0.4742628491603166, 8.244065647266912e-05, 0.9232428290648674))
# Add labels and title
plt.xlabel("Theoretical Quantiles")
plt.ylabel("Sample Quantiles")
plt.title("Q-Q Plot of Residuals")

# Show the plot
plt.tight_layout()
plt.show()

We also assumed independence of the observations, i.e. we should not see any auto-correlation. This looks good here:

from statsmodels.tsa.stattools import acf

# Calculate the autocorrelation function (ACF) of the residuals
acf_values = acf(residuals, nlags=40, fft=True)

# Create a figure and axis for the plot
plt.figure(figsize=(8, 6))

# Plot the ACF values using a bar plot
plt.stem(range(len(acf_values)), acf_values)
plt.xlabel("Lag")
plt.ylabel("Autocorrelation")
plt.title("Autocorrelation Function (ACF) of Residuals")

# Add a horizontal line at y = 0 for reference
plt.axhline(y=0, color='gray', linestyle='--')

# Show the plot
plt.tight_layout()
plt.show()

3. Sample from the posterior for the growth model

The goal is to sample from the posterior distribution of the “Growth” model. This task is very similar to the previous one, except that we provide less guidance.

  1. Read data model_growth.csv
  2. Define likelihood, prior, and posterior. What assumptions do you make?
  3. Run the MCMC sampler
  4. Check convergence and mixing. If needed, modify the jump distribution
  5. Perform residual diagnostics. Are the assumptions of the likelihood function OK? (A skeleton of the workflow is sketched below.)
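A hedged Python skeleton of the whole workflow, analogous to the “Monod” exercise above. The parameter names, the prior values, the column names 't' and 'y', and the signature of model_growth are assumptions; adapt them to the actual definitions in models.py and the data file:

import numpy as np
import pandas as pd
import emcee
from scipy.stats import lognorm, norm
from models import model_growth  # signature assumed analogous to model_monod

data_growth = pd.read_csv("../../data/model_growth.csv", sep=" ")

# assumed parameter vector; adapt names and length to `model_growth`
prior_mean = np.array([1.0, 1.0, 0.5])
prior_sd = 0.5 * prior_mean

def logprior_growth(par):
    # lognormal prior, moment-matched as in the "Monod" exercise
    sdlog = np.sqrt(np.log(1 + (prior_sd / prior_mean)**2))
    meanlog = np.log(prior_mean) - 0.5 * sdlog**2
    return np.sum(lognorm.logpdf(par, s=sdlog, scale=np.exp(meanlog)))

def loglikelihood_growth(par):
    # assumption: i.i.d. normal observation noise with sd = last parameter
    y_det = model_growth(par, data_growth['t'])  # column name 't' is an assumption
    return np.sum(norm.logpdf(data_growth['y'], loc=y_det, scale=par[-1]))

def logposterior_growth(par):
    lp = logprior_growth(par)
    if np.isfinite(lp):
        return lp + loglikelihood_growth(par)
    return -np.inf  # reject impossible parameter values

ndim = len(prior_mean)
nwalkers = 10 * ndim
init_pos = prior_mean + 1e-4 * np.random.randn(nwalkers, ndim)

sampler = emcee.EnsembleSampler(nwalkers, ndim, logposterior_growth)
sampler.run_mcmc(init_pos, 5000, progress=False)

# first, sanity-check mixing via the acceptance fraction
print("Mean acceptance fraction:", np.mean(sampler.acceptance_fraction))

From here, convergence checks and residual diagnostics proceed exactly as in the “Monod” exercise.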

Solution

R

TODO

Read data

Define likelihood, prior, and posterior

Run MCMC sampler

Check convergence and mixing

Residual diagnostics

Julia

TODO

Read data

Define likelihood, prior, and posterior

Run MCMC sampler

Check convergence and mixing

Residual diagnostics

Python

TODO

Read data

Define likelihood, prior, and posterior

Run MCMC sampler

Check convergence and mixing

Residual diagnostics